DESCRIPTION

NUCLEIC ACID VACCINES AGAINST
RICKETTSIAL DISEASES AND METHODS OF USE

This invention was made with government support under USAID Grant No. LAG-1328-
G-00-3030-00. The government has certain rights in this invention.

Cross-Reference to a Related Application 
This is a continuation-in-part of U.S. patent application Serial No. 08/733,230, filed
October 17, 1996. 

Technical Field 

This invention relates to nucleic acid vaccines for rickettsial diseases of animals, 
including humans. 

Background of the Invention 

The rickettsias are a group of small bacteria commonly transmitted by arthropod vectors
to man and animals, in which they may cause serious disease. The pathogens causing human 
rickettsial diseases include the agent of epidemic typhus, Rickettsia prowazekii, which has 
resulted in the deaths of millions of people during wartime and natural disasters. The causative 
agents of spotted fever, e.g., Rickettsia rickettsii and Rickettsia conorii, are also included within
this group. Recently, new types of human rickettsial disease caused by members of the tribe 
Ehrlichiae have been described. Ehrlichiae infect leukocytes and endothelial cells of many 
different mammalian species, some of them causing serious human and veterinary diseases. 
Over 400 cases of human ehrlichiosis, including some fatalities, caused by Ehrlichia chaffeensis
have now been reported. Clinical signs of human ehrlichiosis are similar to those of Rocky 
Mountain spotted fever, including fever, nausea, vomiting, headache, and rash. 

Heartwater is another infectious disease caused by a rickettsial pathogen, namely 
Cowdria ruminantium, and is transmitted by ticks of the genus Amblyomma. The disease occurs
throughout most of Africa and has an estimated endemic area of about 5 million square miles. 
In endemic areas, heartwater is a latent infection in indigenous breeds of cattle that have been 
subjected to centuries of natural selection. The problems occur where the disease contacts 
susceptible or naive cattle and other ruminants. Heartwater has been confirmed to be on the
island of Guadeloupe in the Caribbean and is spreading through the Caribbean Islands. The tick
vectors responsible for spreading this disease are already present on the American mainland and 
threaten the livestock industry in North and South America. 

In acute cases of heartwater, animals exhibit a sudden rise in temperature, signs of 
anorexia, cessation of rumination, and nervous symptoms including staggering, muscle 
twitching, and convulsions. Death usually occurs during these convulsions. In peracute cases of
the disease, the animal collapses and dies in convulsions, having shown no
preliminary symptoms. Mortality is high in susceptible animals. Angora sheep infected with
the disease have a 90% mortality rate while susceptible cattle strains have up to a 60% mortality 
rate. 

If detected early, tetracycline or chloramphenicol treatment is effective against
rickettsial infections, but symptoms are similar to those of numerous other infections and there
are no satisfactory diagnostic tests (Helmick, C., K. Bernard, L. D'Angelo [1984] J. Infect. Dis.
150:480).

Animals which have recovered from heartwater are resistant to further homologous, and 
in some cases heterologous, strain challenge. It has similarly been found that persons recovering 
from a rickettsial infection may develop a solid and lasting immunity. Individuals recovered 
from natural infections are often immune to multiple isolates and even species. For example, 
guinea pigs immunized with a recombinant R. conorii protein were partially protected even 
against R. rickettsii (Vishwanath, S., G. McDonald, N. Watkins [1990] Infect. Immun. 58:646).
It is known that there is structural variation in rickettsial antigens between different geographical 
isolates. Thus, a functional recombinant vaccine against multiple isolates would need to contain 
multiple epitopes, e.g., protective T and B cell epitopes, shared between isolates. It is believed 
that serum antibodies do not play a significant role in the mechanism of immunity against 
rickettsia (Uilenberg, G. [1983] Advances in Vet. Sci. and Comp. Med. 27:427-480; Du Plessis,
J.L. [1970] Onderstepoort J. Vet. Res. 37(3):147-150).

Vaccines based on inactivated or attenuated rickettsiae have been developed against 
certain rickettsial diseases, for example against R. prowazekii and R. rickettsii. However, these
vaccines have major problems or disadvantages, including undesirable toxic reactions, difficulty
in standardization, and expense (Woodward, T. [1981] "Rickettsial diseases: certain unsettled
problems in their historical perspective," In Rickettsia and Rickettsial Diseases, W. Burgdorfer
and R. Anacker, eds., Academic Press, New York, pp. 17-40).

A vaccine currently used in the control of heartwater is composed of live infected sheep 
blood. This vaccine also has several disadvantages. First, expertise is required for the
intravenous inoculation techniques required to administer this vaccine. Second, vaccinated
animals may experience shock and so require daily monitoring for a period after vaccination.
There is a possibility of death due to shock throughout this monitoring period, and the drugs
needed to treat any shock induced by vaccination are costly. Third, blood-borne parasites may
be present in the blood vaccine and be transmitted to the vaccinates. Finally, the blood vaccine
requires a cold chain to preserve the vaccine.

Clearly, a safer, more effective vaccine that is easily administered would be particularly
advantageous. For these reasons, and with the advent of new methods in biotechnology,
investigators have concentrated recently on the development of new types of vaccines, including
recombinant vaccines. However, recombinant vaccine antigens must be carefully selected and
presented to the immune system such that shared epitopes are recognized. These factors have
contributed to the search for effective vaccines.

A protective vaccine against rickettsiae that elicits a complete immune response can be
advantageous. A few antigens which potentially can be useful as vaccines have now been
identified and sequenced for various pathogenic rickettsia. The genes encoding the antigens and
that can be employed to recombinantly produce those antigens have also been identified and
sequenced. Certain protective antigens identified for R. rickettsii, R. conorii, and R. prowazekii
(e.g., rOmpA and rOmpB) are large (>100 kDa) and dependent on retention of native conformation
for protective efficacy, but are often degraded when produced in recombinant systems. This
presents technical and quality-control problems if purified recombinant proteins are to be
included in a vaccine. The mode of presentation of a recombinant antigen to the immune system
can also be an important factor in the immune response.

Nucleic acid vaccination has been shown to induce protective immune responses in
non-viral systems and in diverse animal species (Special Conference Issue, WHO meeting on nucleic
acid vaccines [1994] Vaccine 12:1491). Nucleic acid vaccination has induced cytotoxic
lymphocyte (CTL), T-helper 1, and antibody responses, and has been shown to be protective
against disease (Ulmer, J., J. Donnelly, S. Parker et al. [1993] Science 259:1745). For example,
direct intramuscular injection of mice with DNA encoding the influenza nucleoprotein caused
the production of high titer antibodies, nucleoprotein-specific CTLs, and protection against viral
challenge. Immunization of mice with plasmid DNA encoding the Plasmodium yoelii
circumsporozoite protein induced high antibody titers against malaria sporozoites and CTLs, and
protection against challenge infection (Sedegah, M., R. Hedstrom, P. Hobart, S. Hoffman [1994]
Proc. Natl. Acad. Sci. USA 91:9866). Cattle immunized with plasmids encoding bovine
herpesvirus 1 (BHV-1) glycoprotein IV developed neutralizing antibody and were partially
protected (Cox, G., T. Zamb, L. Babiuk [1993] J. Virol. 67:5664). However, it has been a
question in the field of immunization whether the recently discovered technology of nucleic acid
vaccines can provide improved protection against an antigenic drift variant. Moreover, it has
not heretofore been recognized or suggested that nucleic acid vaccines may be successful in
protecting against rickettsial disease or that a major surface protein conserved in rickettsia was
protective against disease.



Brief Summary of the Invention
Disclosed and claimed here are novel vaccines for conferring immunity to rickettsia
infection, including Cowdria ruminantium causing heartwater. Also disclosed are novel nucleic
acid compositions and methods of using those compositions, including to confer immunity in
a susceptible host. Also disclosed are novel materials and methods for diagnosing infections by
Ehrlichia in humans or animals.

One aspect of the subject invention concerns a nucleic acid, e.g., DNA or mRNA,
vaccine containing the major antigenic protein 1 gene (MAP1) or the major antigenic protein
2 gene (MAP2) of rickettsial pathogens. In one embodiment, the nucleic acid vaccines can be
driven by the human cytomegalovirus (HCMV) enhancer-promoter. In studies immunizing mice
by intramuscular injection of a DNA vaccine composition according to the subject invention,
immunized mice seroconverted and reacted with MAP1 in antigen blots. Splenocytes from
immunized mice, but not from control mice immunized with vector only, proliferated in
response to recombinant MAP1 and rickettsial antigens in in vitro lymphocyte proliferation tests.
In experiments testing different DNA vaccine dose regimens, increased survival rates as
compared to controls were observed on challenge with rickettsia. Accordingly, the subject
invention concerns the discovery that DNA vaccines can induce protective immunity against
rickettsial disease or death resulting therefrom.



Brief Description of the Drawings 
Figures 1A-1C show a comparison of the amino acid sequences from alignment of the
three rickettsial proteins, namely, Cowdria ruminantium (C.r.), Ehrlichia chaffeensis (E.c.), and
Anaplasma marginale (A.m.).

Figures 2A-2C show the DNA sequence of the 28 kDa gene locus cloned from E.
chaffeensis (Figs. 2A-2B) and E. canis (Fig. 2C). One letter amino acid codes for the deduced
protein sequences are presented below the nucleotide sequence. The proposed sigma-70-like
promoter sequences (38) are presented in bold and underlined text as -10 and -35 (consensus -35
and -10 sequences are TTGACA and TATAAT, respectively). Similarly, consensus ribosomal
binding sites and transcription terminator sequences (bold letter sequence) are identified. G-rich
regions identified in the E. chaffeensis sequence are underlined. The conserved sequences from
within the coding regions selected for RT-PCR assay are identified with italics and underlined
text.

Figure 3A shows the complete sequence of the MAP2 homolog of Ehrlichia canis. The
arrow (->) represents the predicted start of the mature protein. The asterisk (*) represents the
stop codon. Underlined nucleotides 5' to the open reading frame with -35 and -10 below 
represent predicted promoter sequences. Double underlined nucleotides represent the predicted 
ribosomal binding site. Underlined nucleotides 3' to the open reading frame represent possible 
transcription termination sequences. 

Figure 3B shows the complete sequence of the MAP2 homolog of Ehrlichia chaffeensis.
The arrow (->) represents the predicted start of the mature protein. The asterisk (*) represents 
the stop codon. Underlined nucleotides 5' to the open reading frame with -35 and -10 below 
represent predicted promoter sequences. Double underlined nucleotides represent the predicted 
ribosomal binding site. Underlined nucleotides 3' to the open reading frame represent possible 
transcription termination sequences. 

Brief Description of the Sequences 
SEQ ID NO. 1 is the coding sequence of the MAP1 gene from Cowdria ruminantium
(Highway isolate).

SEQ ID NO. 2 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 1.

SEQ ID NO. 3 is the coding sequence of the MAP1 gene from Ehrlichia chaffeensis.

SEQ ID NO. 4 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 3.

SEQ ID NO. 5 is the Anaplasma marginale MSP4 gene coding sequence.

SEQ ID NO. 6 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 5.

SEQ ID NO. 7 is a partial coding sequence of the VSA1 gene from Ehrlichia
chaffeensis, also shown in Figures 2A-2B.

SEQ ID NO. 8 is the coding sequence of the VSA2 gene from Ehrlichia chaffeensis,
also shown in Figures 2A-2B.

SEQ ID NO. 9 is the coding sequence of the VSA3 gene from Ehrlichia chaffeensis,
also shown in Figures 2A-2B.

SEQ ID NO. 10 is the coding sequence of the VSA4 gene from Ehrlichia chaffeensis,
also shown in Figures 2A-2B.

SEQ ID NO. 11 is a partial coding sequence of the VSA5 gene from Ehrlichia
chaffeensis, also shown in Figures 2A-2B.

SEQ ID NO. 12 is the coding sequence of the VSA1 gene from Ehrlichia canis, also
shown in Figure 2C.

SEQ ID NO. 13 is a partial coding sequence of the VSA2 gene from Ehrlichia canis,
also shown in Figure 2C.

SEQ ID NO. 14 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 7,
also shown in Figures 2A-2B.

SEQ ID NO. 15 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 8,
also shown in Figures 2A-2B.

SEQ ID NO. 16 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 9,
also shown in Figures 2A-2B.

SEQ ID NO. 17 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 10,
also shown in Figures 2A-2B.

SEQ ID NO. 18 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 11,
also shown in Figures 2A-2B.

SEQ ID NO. 19 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 12,
also shown in Figure 2C.

SEQ ID NO. 20 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 13,
also shown in Figure 2C.

SEQ ID NO. 21 is the coding sequence of the MAP2 gene from Ehrlichia canis, also
shown in Figure 3A.

SEQ ID NO. 22 is the coding sequence of the MAP2 gene from Ehrlichia chaffeensis,
also shown in Figure 3B.

SEQ ID NO. 23 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 21,
also shown in Figure 3A.

SEQ ID NO. 24 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 22,
also shown in Figure 3B.



Detailed Disclosure of the Invention

In one embodiment, the subject invention concerns a novel strategy, termed nucleic acid
vaccination, for eliciting an immune response protective against rickettsial disease. The subject
invention also concerns novel compositions that can be employed according to this novel
strategy for eliciting a protective immune response. According to the subject invention,
recombinant plasmid DNA or mRNA encoding an antigen of interest is inoculated directly into
the human or animal host where the antigen is expressed and an immune response induced.
Advantageously, problems of protein purification, as can be encountered with antigen delivery
using live vectors, can be virtually eliminated by employing the compositions or methods
according to the subject invention. Unlike live vector delivery, the subject invention can provide
a further advantage in that the DNA or RNA does not replicate in the host, but remains episomal
with gene expression directed for as long as 19 months or more post-injection. See, for example,
Wolff, J.A., J.J. Ludtke, G. Acsadi, P. Williams, A. Jani (1992) Hum. Mol. Genet. 1:363. A
complete immune response can be obtained as recombinant antigen is synthesized intracellularly
and presented to the host immune system in the context of autologous class I and class II MHC
molecules.

In one embodiment, the subject invention concerns nucleic acids and compositions
comprising those nucleic acids that can be effective in protecting an animal from disease or
death caused by rickettsia. For example, a nucleic acid vaccine of the subject invention has been
shown to be protective against Cowdria ruminantium, the causative agent of heartwater in
domestic ruminants. Accordingly, DNA sequences of rickettsial genes, e.g., MAP1 or
homologues thereof, can be used as nucleic acid vaccines against human and animal rickettsial
diseases. The MAP1 gene used to obtain this protection is also present in other rickettsiae
including Anaplasma marginale, Ehrlichia canis, and in a causative agent of human ehrlichiosis,
Ehrlichia chaffeensis (van Vliet, A., F. Jongejan, M. van Kleef, B. van der Zeijst [1994] Infect.
Immun. 62:1451). The MAP1 gene or a MAP1-like gene can also be found in certain Rickettsia
spp. MAP1-like genes from Ehrlichia chaffeensis and Ehrlichia canis have now been cloned
and sequenced. These MAP1 homologs are also referred to herein as Variable Surface Antigen
(VSA) genes.

The present invention also concerns polynucleotides encoding MAP2 or MAP2
homologs from Ehrlichia canis and Ehrlichia chaffeensis. MAP2 polynucleotide sequences of
the invention can be used as vaccine compositions and in diagnostic assays. The polynucleotides
can also be used to produce the MAP2 polypeptides encoded thereby.

Compositions comprising the subject polynucleotides can include appropriate nucleic
acid vaccine vectors (plasmids), which are commercially available (e.g., Vical, San Diego, CA).
In addition, the compositions can include a pharmaceutically acceptable carrier, e.g., saline. The
pharmaceutically acceptable carriers are well known in the art and also are commercially
available. For example, such acceptable carriers are described in E.W. Martin's Remington's
Pharmaceutical Science, Mack Publishing Company, Easton, PA.

The subject invention also concerns polypeptides encoded by the subject
polynucleotides. Specifically exemplified are the polypeptides encoded by the MAP1 and VSA
genes of C. ruminantium, E. chaffeensis, E. canis and the MSP4 gene of Anaplasma marginale.
Polypeptides encoded by E. chaffeensis and E. canis MAP2 genes are also exemplified herein.

Also encompassed within the scope of the present invention are fragments and variants
of the exemplified polynucleotides. Variants include polynucleotides and/or polypeptides
having base or amino acid additions, deletions and substitutions in the sequence of the subject
molecule so long as those variants have substantially the same activity or serologic reactivity
as the native molecules. Also included are allelic variants of the subject polynucleotides. The
polypeptides and peptides of the present invention can be used to raise antibodies that are
reactive with the polypeptides disclosed herein. The polypeptides and peptides can also be used
as molecular weight markers.

Another aspect of the subject invention concerns antibodies reactive with MAP1 and
MAP2 polypeptides disclosed herein. Antibodies can be monoclonal or polyclonal and can be
produced using standard techniques known in the art. Antibodies of the invention can be used
in diagnostic and therapeutic applications.

In a specific embodiment, the subject invention concerns a DNA vaccine (e.g.,
VCL1010/MAP1) containing the major antigenic protein 1 gene (MAP1) driven by the human
cytomegalovirus (HCMV) enhancer-promoter. This vaccine was injected intramuscularly into
8-10 week-old female DBA/2 mice after treating them with 50 µl/muscle of 0.5% bupivacaine 3 days
previously. Up to 75% of the VCL1010/MAP1-immunized mice seroconverted and reacted with
MAP1 in antigen blots. Splenocytes from immunized mice, but not from control mice
immunized with VCL1010 DNA (plasmid vector, Vical, San Diego), proliferated in response to
recombinant MAP1 and C. ruminantium antigens in in vitro lymphocyte proliferation tests.
These proliferating cells from mice immunized with VCL1010/MAP1 DNA secreted
IFN-gamma and IL-2 at concentrations ranging from 610 pg/ml and 152 pg/ml to 1290 pg/ml and 310
pg/ml, respectively. In experiments testing different VCL1010/MAP1 DNA vaccine dose
regimens (25-100 µg/dose, 2 or 4 immunizations), survival rates of 23% to 88% (35/92
survivors/total in all VCL1010/MAP1 immunized groups) were observed on challenge with
30 LD50 of C. ruminantium. Survival rates of 0% to 3% (1/144 survivors/total in all control
groups) were recorded for control mice immunized similarly with VCL1010 DNA or saline.
Accordingly, the subject invention concerns the discovery that the gene encoding the MAP1
protein can induce protective immunity as a DNA vaccine against rickettsial disease.

The nucleic acid sequences described herein have other uses as well. For example, the
nucleic acids of the subject invention can be useful as probes to identify complementary
sequences within other nucleic acid molecules or genomes. Such use of probes can be applied
to identify or distinguish infectious strains of organisms in diagnostic procedures or in rickettsial
research where identification of particular organisms or strains is needed. As is well known in
the art, probes can be made by labeling the nucleic acid sequences of interest according to
accepted nucleic acid labeling procedures and techniques. A person of ordinary skill in the art
would recognize that variations or fragments of the disclosed sequences which can specifically
and selectively hybridize to the DNA of rickettsia can also function as a probe. It is within the
ordinary skill of persons in the art, and does not require undue experimentation in view of the
description provided herein, to determine whether a segment of the claimed DNA sequences is
a fragment or variant which has characteristics of the full sequence, e.g., whether it specifically
and selectively hybridizes or can confer protection against rickettsial infection in accordance
with the subject invention. In addition, with the benefit of the subject disclosure describing the
specific sequences, it is within the ordinary skill of those persons in the art to label hybridizing
sequences to produce a probe.

It is also well known in the art that restriction enzymes can be used to obtain functional
fragments of the subject DNA sequences. For example, Bal31 exonuclease can be conveniently
used for time-controlled limited digestion of DNA (commonly referred to as "erase-a-base"
procedures). See, for example, Maniatis et al. (1982) Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratory, New York; Wei et al. (1983) J. Biol. Chem. 258:13006-13512.

In addition, the nucleic acid sequences of the subject invention can be used as molecular 
weight markers in nucleic acid analysis procedures. 

Following are examples which illustrate procedures for practicing the invention. These
examples should not be construed as limiting. All percentages are by weight and all solvent
mixture proportions are by volume unless otherwise noted.

Example 1 

A nucleic acid vaccine construct was tested in animals for its ability to protect against
death caused by infection with the rickettsia Cowdria ruminantium. The vaccine construct tested
was the MAP1 gene of C. ruminantium inserted into plasmid VCL1010 (Vical, San Diego) under
control of the human cytomegalovirus promoter-enhancer and intron A. In this study, seven
groups containing 10 mice each were injected twice at 2-week intervals with either 100, 75, 50,
or 25 µg VCL1010/MAP1 DNA (V/M in Table 1 below), or 100 or 50 µg VCL1010 DNA (V in
Table 1) or saline (Sal.), respectively. Two weeks after the last injections, 8 mice/group were
challenged with 30 LD50 of C. ruminantium and clinical symptoms and survival monitored. The
remaining 2 mice/group were not challenged and were used for lymphocyte proliferation tests
and cytokine measurements. The results of the study are summarized in Table 1, below:



Table 1

            100 µg   75 µg   50 µg   25 µg   100 µg   50 µg
            V/M      V/M     V/M     V/M     V        V       Sal.
Survived    5        7       5       3       0        0       0
Died        3        1       3       5       8        8       8

The VCL1010/MAP1 nucleic acid vaccine increased survival on challenge in all groups, with
a total of 20/32 mice surviving compared to 0/24 in the control groups.
This study was repeated with another 6 groups, each containing 33 mice (a total of 198
mice). Three groups received 75 µg VCL1010/MAP1 DNA or VCL1010 DNA or saline (4
injections in all cases). Two weeks after the last injection, 30 mice/group were challenged with
30 LD50 of C. ruminantium and 3 mice/group were sacrificed for lymphocyte proliferation tests
and cytokine measurements. The results of this study are summarized in Table 2, below:

Table 2

            V/M 2 inj.   V 2 inj.   Sal. 2 inj.   V/M 4 inj.   V 4 inj.   Sal. 4 inj.
Survived    7            0          0             8            0          1
Died*       23           30         30            22           30         29

*In mice that died in both V/M groups, there was an increase in mean survival time of 
approximately 4 days compared to the controls (p<0.05). 



Again, as summarized in Table 2, the VCL1010/MAP1 DNA vaccine increased the
numbers of mice surviving in both immunized groups, although there was no apparent benefit
of 2 additional injections. In these two experiments, there were a cumulative total of 35/92
(38%) surviving mice in groups receiving the VCL1010/MAP1 DNA vaccine compared to 1/144
(0.7%) surviving mice in the control groups. In both immunization and challenge trials
described above, splenocytes from VCL1010/MAP1 immunized mice, but not from control
mice, specifically proliferated to recombinant MAP1 protein and to C. ruminantium in
lymphocyte proliferation tests. These proliferating splenocytes secreted IL-2 and
gamma-interferon at concentrations up to 310 and 1290 pg/ml, respectively. These data show that
protection against rickettsial infections can be achieved with a DNA vaccine. In addition, these
experiments show MAP1-related proteins as vaccine targets.
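
The cumulative survival figures quoted for the two challenge experiments can be re-derived from the per-group counts in Tables 1 and 2. The following is a minimal sketch in Python, offered only as an arithmetic check; the dictionary layout, group labels, and function names are illustrative assumptions and are not part of the original disclosure.

```python
# Per-group (survived, died) counts transcribed from Tables 1 and 2.
# Group labels are illustrative; "V/M" = VCL1010/MAP1 DNA, "V" = vector only.
TABLE_1 = {
    "V/M 100 ug": (5, 3),
    "V/M 75 ug": (7, 1),
    "V/M 50 ug": (5, 3),
    "V/M 25 ug": (3, 5),
    "V 100 ug": (0, 8),
    "V 50 ug": (0, 8),
    "Sal.": (0, 8),
}
TABLE_2 = {
    "V/M 2 inj.": (7, 23),
    "V 2 inj.": (0, 30),
    "Sal. 2 inj.": (0, 30),
    "V/M 4 inj.": (8, 22),
    "V 4 inj.": (0, 30),
    "Sal. 4 inj.": (1, 29),
}


def tally(table, vaccinated):
    """Sum (survivors, challenged) over vaccinated or over control groups."""
    rows = [counts for label, counts in table.items()
            if label.startswith("V/M") == vaccinated]
    survivors = sum(s for s, _ in rows)
    challenged = sum(s + d for s, d in rows)
    return survivors, challenged


def combine(*pairs):
    """Add several (survivors, challenged) pairs together."""
    return sum(p[0] for p in pairs), sum(p[1] for p in pairs)


def percent(pair):
    """Survival expressed as a percentage of challenged mice."""
    return 100.0 * pair[0] / pair[1]


vaccinated_total = combine(tally(TABLE_1, True), tally(TABLE_2, True))
control_total = combine(tally(TABLE_1, False), tally(TABLE_2, False))

print("vaccinated:", vaccinated_total, f"{percent(vaccinated_total):.1f}%")
print("controls:  ", control_total, f"{percent(control_total):.1f}%")
```

Summed this way, the four V/M groups of Table 1 account for 32 challenged mice and the two V/M groups of Table 2 for 60, which together give the 92 mice in the cumulative total.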

Example 2 

The MAP1 protein of C. ruminantium has significant similarity to MSP4 of A.
marginale, and related molecules may also be present in other rickettsial pathogens. To prove
this, we used primers based on regions conserved between C. ruminantium and A. marginale in
PCR to clone a MAP1-like gene from E. chaffeensis. The amino acid sequence derived from
the cloned E. chaffeensis MAP1-like gene, and alignment with the corresponding genes of C.
ruminantium and A. marginale is shown in Figure 1. We have now identified the regions of
MAP1-like genes which are highly conserved between Ehrlichia, Cowdria, and Anaplasma and
which can allow cloning of the analogous genes from other rickettsiae.

Example 3 - Cloning and sequence analysis of MAP1 homologous genes of E. chaffeensis and
E. canis

Genes homologous to the major surface protein of C. ruminantium MAP1 were cloned
from E. chaffeensis and E. canis by using PCR cloning strategies. The cloned segments
represent a 4.6 kb genomic locus of E. chaffeensis and a 1.6 kb locus of E. canis. DNA sequence
generated from these clones was assembled and is presented along with the deduced amino acid
sequence in Figures 2A-2B (SEQ ID NOs. 7-11 and 14-18) and Figure 2C (SEQ ID NOs. 12-13
and 19-20). Significant features of the DNA include five very similar but nonidentical open
reading frames (ORFs) for E. chaffeensis and two very similar, nonidentical ORFs for the E.
canis cloned locus. The ORFs for both Ehrlichia spp. are separated by noncoding sequences
ranging from 264 to 310 base pairs. The noncoding sequences have a higher A+T content
(71.6% for E. chaffeensis and 76.1% for E. canis) than do the coding sequences (63.5% for E.
chaffeensis and 68.0% for E. canis). A G-rich region approximately 200 bases upstream from the
initiation codon, sigma-70-like promoter sequences, putative ribosome binding sites (RBS), termination
codons, and palindromic sequences near the termination codons are found in each of the E.
chaffeensis noncoding sequences. The E. canis noncoding sequence has the same features except
for the G-rich region (Figure 2C; SEQ ID NOs. 12-13 and 19-20).
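
The A+T percentages reported for the coding and noncoding sequences are simple base-composition ratios. The following minimal Python sketch shows one way such a figure could be computed; the function name and the choice to ignore ambiguous bases are assumptions made for illustration, and the software actually used for the reported values is not stated. The example input is the first four codons of SEQ ID NO. 1 as printed in the sequence listing, chosen only as a convenient short string.

```python
def at_content(sequence: str) -> float:
    """Return the A+T content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted; whitespace and
    ambiguity codes such as N are skipped (an illustrative assumption).
    """
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    at_count = sum(1 for b in bases if b in "AT")
    return 100.0 * at_count / len(bases)


# First four codons of SEQ ID NO. 1, used here only as sample input.
example = "ATG AAT TGC AAG"
print(f"A+T content: {at_content(example):.1f}%")
```

Applied separately to each ORF and to each intergenic segment of a locus, a function of this kind would yield the coding versus noncoding comparison described above.
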
Sequence comparisons of the ORFs at the nucleotide and translated amino acid levels
revealed a high degree of similarity between them. The similarity spanned the entire coding
sequences, except in three regions where notable sequence variations were observed including
some deletions/insertions (Variable Regions I, II and III). Despite the similarities, no two ORFs
are identical. The cloned ORF2, 3 and 4 of E. chaffeensis have complete coding sequences.
The ORF1 is a partial gene having only 143 amino acids at the C-terminus whereas the ORF5
is nearly complete but lacks 5-7 amino acids and a termination codon. The cloned ORF2 of E.
canis also is a partial gene lacking a part of the C-terminal sequence. The overall similarity
between different ORFs at the amino acid level is 56.0% to 85.4% for E. chaffeensis, whereas
for E. canis it is 53.3%. The similarity of E. chaffeensis ORFs to the MAP1 coding sequences
reported for C. ruminantium isolates ranged from 55.5% to 66.7%, while for E. canis to C.
ruminantium it is 48.5% to 54.2%. Due to their high degree of similarity to MAP1 surface
antigen genes of C. ruminantium and since they are nonidentical to each other, the E. chaffeensis
and E. canis ORFs are referred to herein as putative Variable Surface Antigen (VSA) genes. The
apparent molecular masses of the predicted mature proteins of E. chaffeensis were 28.75 kDa
for VSA2, 27.78 kDa for VSA3, and 27.95 kDa for VSA4, while E. canis VSA1 was slightly higher at
29.03 kDa. The first 25 amino acids in each VSA coding sequence were eliminated when
calculating the protein size since they markedly resembled the signal sequence of C.
ruminantium MAP1 and presumably would be absent from the mature protein. Predicted protein
sizes for E. chaffeensis VSA1 and VSA5, and E. canis VSA2 were not calculated since the
complete genes were not cloned.
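
The size calculation described above, in which the first 25 residues are dropped as a presumed signal sequence before the mass is computed, can be sketched as follows. This is a minimal illustration in Python under stated assumptions: the residue masses are standard average values, one water is added for the free termini, and the function and constant names are hypothetical. The method and software actually used to obtain the reported kDa values are not stated, so small numerical differences would be expected.

```python
# Standard average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528          # added once for the N- and C-terminal groups
SIGNAL_PEPTIDE_LENGTH = 25     # residues removed, as described in the text


def mature_mass_kda(precursor: str,
                    signal_length: int = SIGNAL_PEPTIDE_LENGTH) -> float:
    """Average molecular mass (kDa) of a protein after signal removal.

    `precursor` is a one-letter amino acid string for the full ORF product.
    """
    mature = precursor.upper()[signal_length:]
    if not mature:
        raise ValueError("precursor is not longer than the signal sequence")
    mass_da = sum(AVERAGE_RESIDUE_MASS[aa] for aa in mature) + WATER_MASS
    return mass_da / 1000.0


# Hypothetical toy input: 25 placeholder signal residues followed by "GA".
toy_precursor = "M" * 25 + "GA"
print(f"{mature_mass_kda(toy_precursor):.5f} kDa")
```

Supplying the deduced VSA2, VSA3, VSA4, or E. canis VSA1 sequences from the sequence listing in place of the toy input would give values comparable to those reported, subject to the assumptions noted.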

It should be understood that the examples and embodiments described herein are for
illustrative purposes only and that various modifications or changes in light thereof will be
suggested to persons skilled in the art and are to be included within the spirit and purview of this
application and the scope of the appended claims.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:

Applicant Name(s): University of Florida
Street Address: 223 Grinter Hall
City: Gainesville
State/Province: Florida
Country: US
Postal Code/Zip: 32611
Phone number: (352) 392-8929
Fax: (352) 392-6600

(ii) TITLE OF INVENTION: Nucleic Acid Vaccines Against 
Rickettsial Diseases and Methods of Use 

(iii) NUMBER OF SEQUENCES: 24 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Saliwanchik, Lloyd & Saliwanchik

(B) STREET: 2421 N.W. 41st Street, Suite A-1 

(C) CITY: Gainesville 

(D) STATE: FL 

(E) COUNTRY: USA 

(F) ZIP: 32606 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT 

(B) FILING DATE: 17 October 1997 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Pace, Doran R. 

(B) REGISTRATION NUMBER: 38,261 

(C) REFERENCE/DOCKET NUMBER: UF-167C1 

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 352-375-8100

(B) TELEFAX: 352-372-5800



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 864 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..861 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATG AAT TGC AAG AAA ATT TTT ATC ACA ACT ACA CTA ATA TCA TTA GTG 48
Met Asn Cys Lys Lys Ile Phe Ile Thr Ser Thr Leu Ile Ser Leu Val
1 5 10 15

TCA TTT TTA CCT GGT GTG TCC TTT TCT GAT GTA ATA CAG GAA GAC AGC 96
Ser Phe Leu Pro Gly Val Ser Phe Ser Asp Val Ile Gln Glu Asp Ser
20 25 30

AAC CCA GCA GGC AGT GTT TAC ATT AGC GCA AAA TAC ATG CCA ACT GCA 144
Asn Pro Ala Gly Ser Val Tyr Ile Ser Ala Lys Tyr Met Pro Thr Ala
35 40 45

TCA CAT TTT GGT AAA ATG TCA ATC AAA GAA GAT TCA AAA AAT ACT CAA 192
Ser His Phe Gly Lys Met Ser Ile Lys Glu Asp Ser Lys Asn Thr Gln
50 55 60

ACG GTA TTT GGT CTA AAA AAA GAT TGG GAT GGC GTT AAA ACA CCA TCA 240
Thr Val Phe Gly Leu Lys Lys Asp Trp Asp Gly Val Lys Thr Pro Ser
65 70 75 80

GAT TCT AGC AAT ACT AAT TCT ACA ATT TTT ACT GAA AAA GAC TAT TCT 288
Asp Ser Ser Asn Thr Asn Ser Thr Ile Phe Thr Glu Lys Asp Tyr Ser
85 90 95

TTC AGA TAT GAA AAC AAT CCG TTT TTA GGT TTC GCT GGA GCA ATT GGG 336
Phe Arg Tyr Glu Asn Asn Pro Phe Leu Gly Phe Ala Gly Ala Ile Gly
100 105 110

TAC TCA ATG AAT GGA CCA AGA ATA GAG TTC GAA GTA TCC TAT GAA ACT 384
Tyr Ser Met Asn Gly Pro Arg Ile Glu Phe Glu Val Ser Tyr Glu Thr
115 120 125

TTT GAT GTA AAA AAC CTA GGT GGC AAC TAT AAA AAC AAC GCA CAC ATG 432
Phe Asp Val Lys Asn Leu Gly Gly Asn Tyr Lys Asn Asn Ala His Met
130 135 140

TAC TGT GCT TTA GAT ACA GCA GCA CAA AAT AGC ACT AAT GGC GCA GGA 480
Tyr Cys Ala Leu Asp Thr Ala Ala Gln Asn Ser Thr Asn Gly Ala Gly
145 150 155 160

TTA ACT ACA TCT GTT ATG GTA AAA AAC GAA AAT TTA ACA AAT ATA TCA 528
Leu Thr Thr Ser Val Met Val Lys Asn Glu Asn Leu Thr Asn Ile Ser
165 170 175

TTA ATG TTA AAT GCG TGT TAT GAT ATC ATG CTT GAT GGA ATA CCA GTT 576
Leu Met Leu Asn Ala Cys Tyr Asp Ile Met Leu Asp Gly Ile Pro Val
180 185 190

TCT CCA TAT GTA TGT GCA GGT ATT GGC ACT GAC TTA GTG TCA GTA ATT 624
Ser Pro Tyr Val Cys Ala Gly Ile Gly Thr Asp Leu Val Ser Val Ile
195 200 205

AAT GCT ACA AAT CCT AAA TTA TCT TAT CAA GGA AAG CTA GGC ATA AGT 672
Asn Ala Thr Asn Pro Lys Leu Ser Tyr Gln Gly Lys Leu Gly Ile Ser
210 215 220

TAC TCA ATC AAT TCT GAA GCT TCT ATC TTT ATC GGT GGA CAT TTC CAT 720
Tyr Ser Ile Asn Ser Glu Ala Ser Ile Phe Ile Gly Gly His Phe His
225 230 235 240

AGA GTT ATA GGT AAT GAA TTT AAA GAT ATT GCT ACC TTA AAA ATA TTT 768
Arg Val Ile Gly Asn Glu Phe Lys Asp Ile Ala Thr Leu Lys Ile Phe
245 250 255

ACT TCA AAA ACA GGA ATA TCT AAT CCT GGC TTT GCA TCA GCA ACA CTT 816
Thr Ser Lys Thr Gly Ile Ser Asn Pro Gly Phe Ala Ser Ala Thr Leu
260 265 270

GAT GTT TGT CAC TTT GGT ATA GAA ATT GGA GGA AGG TTT GTA TTT 861
Asp Val Cys His Phe Gly Ile Glu Ile Gly Gly Arg Phe Val Phe
275 280 285

TAA 864
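
As an illustration only, the codon-by-codon translation shown in a listing of this kind can be reproduced with the standard genetic code. The sketch below is not part of the PatentIn output; `translate` is a hypothetical helper name, and the use of the standard code table is an assumption.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a coding sequence into single-letter amino acids.

    Whitespace between codons is ignored. Translation stops at the first
    stop codon, which is not included in the returned string.
    """
    seq = "".join(cds.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    protein = []
    for i in range(0, len(seq), 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

For instance, the first six codons listed for SEQ ID NO: 1, `ATG AAT TGC AAG AAA ATT`, translate to `MNCKKI`, matching the listed residues Met Asn Cys Lys Lys Ile.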



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 287 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Asn Cys Lys Lys Ile Phe Ile Thr Ser Thr Leu Ile Ser Leu Val
1 5 10 15

Ser Phe Leu Pro Gly Val Ser Phe Ser Asp Val Ile Gln Glu Asp Ser
20 25 30

Asn Pro Ala Gly Ser Val Tyr Ile Ser Ala Lys Tyr Met Pro Thr Ala
35 40 45

Ser His Phe Gly Lys Met Ser Ile Lys Glu Asp Ser Lys Asn Thr Gln
50 55 60

Thr Val Phe Gly Leu Lys Lys Asp Trp Asp Gly Val Lys Thr Pro Ser
65 70 75 80

Asp Ser Ser Asn Thr Asn Ser Thr Ile Phe Thr Glu Lys Asp Tyr Ser
85 90 95

Phe Arg Tyr Glu Asn Asn Pro Phe Leu Gly Phe Ala Gly Ala Ile Gly
100 105 110

Tyr Ser Met Asn Gly Pro Arg Ile Glu Phe Glu Val Ser Tyr Glu Thr
115 120 125

Phe Asp Val Lys Asn Leu Gly Gly Asn Tyr Lys Asn Asn Ala His Met
130 135 140

Tyr Cys Ala Leu Asp Thr Ala Ala Gln Asn Ser Thr Asn Gly Ala Gly
145 150 155 160

Leu Thr Thr Ser Val Met Val Lys Asn Glu Asn Leu Thr Asn Ile Ser
165 170 175

Leu Met Leu Asn Ala Cys Tyr Asp Ile Met Leu Asp Gly Ile Pro Val
180 185 190

Ser Pro Tyr Val Cys Ala Gly Ile Gly Thr Asp Leu Val Ser Val Ile
195 200 205

Asn Ala Thr Asn Pro Lys Leu Ser Tyr Gln Gly Lys Leu Gly Ile Ser
210 215 220

Tyr Ser Ile Asn Ser Glu Ala Ser Ile Phe Ile Gly Gly His Phe His
225 230 235 240

Arg Val Ile Gly Asn Glu Phe Lys Asp Ile Ala Thr Leu Lys Ile Phe
245 250 255

Thr Ser Lys Thr Gly Ile Ser Asn Pro Gly Phe Ala Ser Ala Thr Leu
260 265 270

Asp Val Cys His Phe Gly Ile Glu Ile Gly Gly Arg Phe Val Phe
275 280 285
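
To compare a three-letter protein listing such as the one above with sequences written in single-letter code, the listing can be converted mechanically. The following is a minimal sketch; `three_to_one` is a hypothetical helper, and it assumes the listing contains only the twenty standard residue abbreviations plus position numbers.

```python
# IUPAC three-letter to one-letter amino acid codes (standard residues only).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def three_to_one(listing: str) -> str:
    """Convert a three-letter residue listing to a single-letter string.

    Position numbers (purely numeric tokens) are skipped; any other
    unrecognized token raises KeyError so that damaged text is not
    silently accepted.
    """
    tokens = [tok for tok in listing.split() if not tok.isdigit()]
    return "".join(THREE_TO_ONE[tok] for tok in tokens)
```

Because unknown tokens raise an error, the same helper also flags residue abbreviations that were corrupted during copying.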



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 842 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..840 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 



WO 98/16554    PCT/US97/19044

ATG AAT TAC AAA AAA ACT TTC ATA ACA GCG ATT GAT ATC ATT AAT ATC 48
Met Asn Tyr Lys Lys Ser Phe Ile Thr Ala Ile Asp Ile Ile Asn Ile
290 295 300

CTT CTC TTA CCT GGA GTA TCA TTT TCC GAC CCA AGG CAG GTA GTG GTC 96
Leu Leu Leu Pro Gly Val Ser Phe Ser Asp Pro Arg Gln Val Val Val
305 310 315

ATT AAC GGT AAT TTC TAC ATC AGT GGA AAA TAC GAT GCC AAG GCT TCG 144
Ile Asn Gly Asn Phe Tyr Ile Ser Gly Lys Tyr Asp Ala Lys Ala Ser
320 325 330 335

CAT TTT GGA GTA TTC TCT GCT AAG GAA GAA AGA AAT ACA ACA GTT GGA 192
His Phe Gly Val Phe Ser Ala Lys Glu Glu Arg Asn Thr Thr Val Gly
340 345 350

GTG TTT GGA CTG AAG CAA AAT TGG GAC GGA AGC GCA ATA TCC AAC TCC 240
Val Phe Gly Leu Lys Gln Asn Trp Asp Gly Ser Ala Ile Ser Asn Ser
355 360 365

TCC CCA AAC GAT GTA TTC ACT GTC TCA AAT TAT TCA TTT AAA TAT GAA 288
Ser Pro Asn Asp Val Phe Thr Val Ser Asn Tyr Ser Phe Lys Tyr Glu
370 375 380

AAC AAC CCG TTT TTA GGT TTT GCA GGA GCT ATT GGT TAC TCA ATG GAT 336
Asn Asn Pro Phe Leu Gly Phe Ala Gly Ala Ile Gly Tyr Ser Met Asp
385 390 395

GGT CCA AGA ATA GAG CTT GAA GTA TCT TAT GAA ACA TTT GAT GTA AAA 384
Gly Pro Arg Ile Glu Leu Glu Val Ser Tyr Glu Thr Phe Asp Val Lys
400 405 410 415

AAT CAA GGT AAC AAT TAT AAG AAT GAA GCA CAT AGA TAT TGT GCT CTA 432
Asn Gln Gly Asn Asn Tyr Lys Asn Glu Ala His Arg Tyr Cys Ala Leu
420 425 430

TCC CAT AAC TCA GCA GCA GAC ATG AGT AGT GCA AGT AAT AAT TTT GTC 480
Ser His Asn Ser Ala Ala Asp Met Ser Ser Ala Ser Asn Asn Phe Val
435 440 445

TTT CTA AAA AAT GAA GGA TTA CTT GAC ATA TCA TTT ATG CTG AAC GCA 528
Phe Leu Lys Asn Glu Gly Leu Leu Asp Ile Ser Phe Met Leu Asn Ala
450 455 460

TGC TAT GAC GTA GTA GGC GAA GGC ATA CCT TTT TCT CCT TAT ATA TGC 576
Cys Tyr Asp Val Val Gly Glu Gly Ile Pro Phe Ser Pro Tyr Ile Cys
465 470 475

GCA GGT ATC GGT ACT GAT TTA GTA TCC ATG TTT GAA GCT ACA AAT CCT 624
Ala Gly Ile Gly Thr Asp Leu Val Ser Met Phe Glu Ala Thr Asn Pro
480 485 490 495

AAA ATT TCT TAC CAA GGA AAG TTA GGT TTA AGC TAC TCT ATA AGC CCA 672
Lys Ile Ser Tyr Gln Gly Lys Leu Gly Leu Ser Tyr Ser Ile Ser Pro
500 505 510

GAA GCT TCT GTG TTT ATT GGT GGG CAC TTT CAT AAG GTA ATA GGG AAC 720
Glu Ala Ser Val Phe Ile Gly Gly His Phe His Lys Val Ile Gly Asn
515 520 525

GAA TTT AGA GAT ATT CCT ACT ATA ATA CCT ACT GGA TCA ACA CTT GCA 768
Glu Phe Arg Asp Ile Pro Thr Ile Ile Pro Thr Gly Ser Thr Leu Ala
530 535 540

GGA AAA GGA AAC TAC CCT GCA ATA GTA ATA CTG GAT GTA TGC CAC TTT 816
Gly Lys Gly Asn Tyr Pro Ala Ile Val Ile Leu Asp Val Cys His Phe
545 550 555

GGA ATA GAA ATG GGA GGA AGG TTT AA 842
Gly Ile Glu Met Gly Gly Arg Phe
560 565



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 280 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Asn Tyr Lys Lys Ser Phe Ile Thr Ala Ile Asp Ile Ile Asn Ile
1 5 10 15

Leu Leu Leu Pro Gly Val Ser Phe Ser Asp Pro Arg Gln Val Val Val
20 25 30

Ile Asn Gly Asn Phe Tyr Ile Ser Gly Lys Tyr Asp Ala Lys Ala Ser
35 40 45

His Phe Gly Val Phe Ser Ala Lys Glu Glu Arg Asn Thr Thr Val Gly
50 55 60

Val Phe Gly Leu Lys Gln Asn Trp Asp Gly Ser Ala Ile Ser Asn Ser
65 70 75 80

Ser Pro Asn Asp Val Phe Thr Val Ser Asn Tyr Ser Phe Lys Tyr Glu
85 90 95

Asn Asn Pro Phe Leu Gly Phe Ala Gly Ala Ile Gly Tyr Ser Met Asp
100 105 110

Gly Pro Arg Ile Glu Leu Glu Val Ser Tyr Glu Thr Phe Asp Val Lys
115 120 125

Asn Gln Gly Asn Asn Tyr Lys Asn Glu Ala His Arg Tyr Cys Ala Leu
130 135 140

Ser His Asn Ser Ala Ala Asp Met Ser Ser Ala Ser Asn Asn Phe Val
145 150 155 160

Phe Leu Lys Asn Glu Gly Leu Leu Asp Ile Ser Phe Met Leu Asn Ala
165 170 175

Cys Tyr Asp Val Val Gly Glu Gly Ile Pro Phe Ser Pro Tyr Ile Cys
180 185 190

Ala Gly Ile Gly Thr Asp Leu Val Ser Met Phe Glu Ala Thr Asn Pro
195 200 205

Lys Ile Ser Tyr Gln Gly Lys Leu Gly Leu Ser Tyr Ser Ile Ser Pro
210 215 220

Glu Ala Ser Val Phe Ile Gly Gly His Phe His Lys Val Ile Gly Asn
225 230 235 240

Glu Phe Arg Asp Ile Pro Thr Ile Ile Pro Thr Gly Ser Thr Leu Ala
245 250 255

Gly Lys Gly Asn Tyr Pro Ala Ile Val Ile Leu Asp Val Cys His Phe
260 265 270

Gly Ile Glu Met Gly Gly Arg Phe
275 280


(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 849 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..846

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

ATG AAT TAC AGA GAA TTG TTT ACA GGG GGC CTG TCA GCA GCC ACA GTC 48
Met Asn Tyr Arg Glu Leu Phe Thr Gly Gly Leu Ser Ala Ala Thr Val
285 290 295

TGC GCC TGC TCC CTA CTT GTT AGT GGG GCC GTA GTG GCA TCT CCC ATG 96
Cys Ala Cys Ser Leu Leu Val Ser Gly Ala Val Val Ala Ser Pro Met
300 305 310

AGT CAC GAA GTG GCT TCT GAA GGG GGA GTA ATG GGA GGT AGC TTT TAC 144
Ser His Glu Val Ala Ser Glu Gly Gly Val Met Gly Gly Ser Phe Tyr
315 320 325

GTG GGT GCG GCC TAC AGC CCA GCA TTT CCT TCT GTT ACC TCG TTC GAC 192
Val Gly Ala Ala Tyr Ser Pro Ala Phe Pro Ser Val Thr Ser Phe Asp
330 335 340

ATG CGT GAG TCA AGC AAA GAG ACC TCA TAC GTT AGA GGC TAT GAC AAG 240
Met Arg Glu Ser Ser Lys Glu Thr Ser Tyr Val Arg Gly Tyr Asp Lys
345 350 355 360

AGC ATT GCA ACG ATT GAT GTG AGT GTG CCA GCA AAC TTT TCC AAA TCT 288
Ser Ile Ala Thr Ile Asp Val Ser Val Pro Ala Asn Phe Ser Lys Ser
365 370 375

GGC TAC ACT TTT GCC TTC TCT AAA AAC TTA ATC ACG TCT TTC GAC GGC 336
Gly Tyr Thr Phe Ala Phe Ser Lys Asn Leu Ile Thr Ser Phe Asp Gly
380 385 390

GCT GTG GGA TAT TCT CTG GGA GGA GCC AGA GTG GAA TTG GAA GCG AGC 384
Ala Val Gly Tyr Ser Leu Gly Gly Ala Arg Val Glu Leu Glu Ala Ser
395 400 405

TAC AGA AGG TTT GCT ACT TTG GCG GAC GGG CAG TAC GCA AAA AGT GGT 432
Tyr Arg Arg Phe Ala Thr Leu Ala Asp Gly Gln Tyr Ala Lys Ser Gly
410 415 420

GCG GAA TCT CTG GCA GCT ATT ACC CGC GAC GCT AAC ATT ACT GAG ACC 480
Ala Glu Ser Leu Ala Ala Ile Thr Arg Asp Ala Asn Ile Thr Glu Thr
425 430 435 440

AAT TAC TTC GTA GTC AAA ATT GAT GAA ATC ACA AAC ACC TCA GTC ATG 528
Asn Tyr Phe Val Val Lys Ile Asp Glu Ile Thr Asn Thr Ser Val Met
445 450 455

TTA AAT GGC TGC TAT GAC GTG CTG CAC ACA GAT TTA CCT GTG TCC CCG 576
Leu Asn Gly Cys Tyr Asp Val Leu His Thr Asp Leu Pro Val Ser Pro
460 465 470

TAT GTA TGT GCC GGG ATA GGC GCA AGC TTT GTT GAC ATC TCT AAG CAA 624
Tyr Val Cys Ala Gly Ile Gly Ala Ser Phe Val Asp Ile Ser Lys Gln
475 480 485

GTA ACC ACA AAG CTG GCC TAC AGG GGC AAG GTT GGG ATT AGC TAC CAG 672
Val Thr Thr Lys Leu Ala Tyr Arg Gly Lys Val Gly Ile Ser Tyr Gln
490 495 500

TTT ACT CCG GAA ATA TCC TTG GTG GCA GGT GGG TTC TAC CAC GGG CTA 720
Phe Thr Pro Glu Ile Ser Leu Val Ala Gly Gly Phe Tyr His Gly Leu
505 510 515 520

TTT GAT GAG TCT TAC AAG GAC ATT CCC GCA CAC AAC AGT GTA AAG TTC 768
Phe Asp Glu Ser Tyr Lys Asp Ile Pro Ala His Asn Ser Val Lys Phe
525 530 535

TCT GGA GAA GCA AAA GCC TCA GTC AAA GCG CAT ATT GCT GAC TAC GGC 816
Ser Gly Glu Ala Lys Ala Ser Val Lys Ala His Ile Ala Asp Tyr Gly
540 545 550

TTT AAC CTT GGA GCA AGA TTC CTG TTC AGC TAA 849
Phe Asn Leu Gly Ala Arg Phe Leu Phe Ser
555 560



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 282 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Asn Tyr Arg Glu Leu Phe Thr Gly Gly Leu Ser Ala Ala Thr Val
1 5 10 15

Cys Ala Cys Ser Leu Leu Val Ser Gly Ala Val Val Ala Ser Pro Met
20 25 30

Ser His Glu Val Ala Ser Glu Gly Gly Val Met Gly Gly Ser Phe Tyr
35 40 45

Val Gly Ala Ala Tyr Ser Pro Ala Phe Pro Ser Val Thr Ser Phe Asp
50 55 60

Met Arg Glu Ser Ser Lys Glu Thr Ser Tyr Val Arg Gly Tyr Asp Lys
65 70 75 80

Ser Ile Ala Thr Ile Asp Val Ser Val Pro Ala Asn Phe Ser Lys Ser
85 90 95

Gly Tyr Thr Phe Ala Phe Ser Lys Asn Leu Ile Thr Ser Phe Asp Gly
100 105 110

Ala Val Gly Tyr Ser Leu Gly Gly Ala Arg Val Glu Leu Glu Ala Ser
115 120 125

Tyr Arg Arg Phe Ala Thr Leu Ala Asp Gly Gln Tyr Ala Lys Ser Gly
130 135 140

Ala Glu Ser Leu Ala Ala Ile Thr Arg Asp Ala Asn Ile Thr Glu Thr
145 150 155 160

Asn Tyr Phe Val Val Lys Ile Asp Glu Ile Thr Asn Thr Ser Val Met
165 170 175

Leu Asn Gly Cys Tyr Asp Val Leu His Thr Asp Leu Pro Val Ser Pro
180 185 190

Tyr Val Cys Ala Gly Ile Gly Ala Ser Phe Val Asp Ile Ser Lys Gln
195 200 205

Val Thr Thr Lys Leu Ala Tyr Arg Gly Lys Val Gly Ile Ser Tyr Gln
210 215 220

Phe Thr Pro Glu Ile Ser Leu Val Ala Gly Gly Phe Tyr His Gly Leu
225 230 235 240

Phe Asp Glu Ser Tyr Lys Asp Ile Pro Ala His Asn Ser Val Lys Phe
245 250 255

Ser Gly Glu Ala Lys Ala Ser Val Lys Ala His Ile Ala Asp Tyr Gly
260 265 270

Phe Asn Leu Gly Ala Arg Phe Leu Phe Ser
275 280
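
The header fields of the listings above can be cross-checked arithmetically: the number of codons in each CDS span should equal the stated length of the paired protein. The sketch below uses only the CDS locations and protein lengths given in the listings; `codon_count` and `LISTED` are hypothetical names introduced for illustration.

```python
def codon_count(cds_start: int, cds_end: int) -> int:
    """Return the number of codons in an inclusive CDS span such as 1..861."""
    span = cds_end - cds_start + 1
    if span % 3 != 0:
        raise ValueError("CDS span is not a whole number of codons")
    return span // 3


# (CDS location, stated protein length) as given in the listings above.
LISTED = {
    "SEQ ID NO: 1 / SEQ ID NO: 2": ((1, 861), 287),
    "SEQ ID NO: 3 / SEQ ID NO: 4": ((1, 840), 280),
    "SEQ ID NO: 5 / SEQ ID NO: 6": ((1, 846), 282),
}


def check_listed(listed: dict) -> dict:
    """Map each sequence pair to True when codon count equals protein length."""
    return {
        name: codon_count(*cds) == protein_length
        for name, (cds, protein_length) in listed.items()
    }
```

All three pairs are consistent under this check. The nucleotides remaining after each CDS (three for SEQ ID NO: 1 and SEQ ID NO: 5, two for SEQ ID NO: 3) account for the difference between the CDS end and the stated sequence length.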



Claims 



1. A composition comprising a polynucleotide which encodes a polypeptide having the
characteristic of eliciting an immune response protective against disease or death caused by a
rickettsial pathogen.

2. The composition, according to claim 1, wherein said rickettsial pathogen is selected
from the group consisting of Rickettsia spp., Ehrlichia spp., Anaplasma spp., and Cowdria spp.

3. The composition, according to claim 1, wherein said polypeptide has an amino acid
sequence selected from the group consisting of SEQ ID NO. 2, SEQ ID NO. 4, SEQ ID NO. 6,
SEQ ID NO. 14, SEQ ID NO. 15, SEQ ID NOS. 16-20, SEQ ID NO. 23, and SEQ ID NO. 24,
or a fragment thereof.

4. The composition, according to claim 1, wherein said polynucleotide has a nucleic
acid sequence selected from the group consisting of SEQ ID NO. 1, SEQ ID NO. 3, SEQ ID NO.
5, SEQ ID NO. 7, SEQ ID NO. 8, SEQ ID NOS. 9-13, SEQ ID NO. 21, and SEQ ID NO. 22,
or a fragment thereof.

5. The composition, according to claim 4, wherein said polynucleotide has a nucleic
acid sequence of SEQ ID NO. 3, or a fragment thereof.

6. The composition, according to claim 1, wherein said polynucleotide further
comprises a nucleic acid vaccine vector.

7. The composition, according to claim 1, further comprising a pharmaceutically
acceptable carrier.

8. A polynucleotide encoding a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO. 4, SEQ ID NOS. 14-20, SEQ ID NOS. 23-24, and
fragments thereof.

9. The polynucleotide, according to claim 8, said polynucleotide having a nucleic acid
sequence selected from the group consisting of SEQ ID NO. 3, SEQ ID NOS. 7-13, and SEQ
ID NOS. 21-22.

10. A method for protecting a susceptible animal host against disease or death caused
by a rickettsial pathogen, said method comprising administering an effective amount of a
polynucleotide encoding polypeptide having the characteristic of eliciting an immune response
protective against said rickettsial pathogen.

11. The method, according to claim 10, wherein said rickettsial pathogen is selected
from the group consisting of Rickettsia spp., Ehrlichia spp., Anaplasma spp., and Cowdria spp.

12. The method, according to claim 10, wherein said polypeptide has an amino acid
sequence selected from the group consisting of SEQ ID NO. 2, SEQ ID NO. 4, SEQ ID NO. 6,
SEQ ID NO. 14, SEQ ID NO. 15, SEQ ID NOS. 16-20, SEQ ID NO. 23, and SEQ ID NO. 24,
or a fragment thereof.

13. The method, according to claim 10, wherein said polynucleotide has a nucleic acid
sequence selected from the group consisting of SEQ ID NO. 1, SEQ ID NO. 3, SEQ ID NO. 5,
SEQ ID NO. 7, SEQ ID NO. 8, SEQ ID NOS. 9-13, SEQ ID NO. 21, and SEQ ID NO. 22.

14. The method, according to claim 13, wherein said polynucleotide has the nucleic acid
sequence of SEQ ID NO. 1.

15. The method, according to claim 13, wherein said polynucleotide has the nucleic acid
sequence of SEQ ID NO. 3.

16. The method, according to claim 13, wherein said polynucleotide has the nucleic acid
sequence of SEQ ID NO. 5.

17. The method, according to claim 10, wherein said nucleic acid further comprises an
appropriate nucleic acid vector.

18. The method, according to claim 10, wherein said composition further comprises a
pharmaceutically acceptable carrier.

19. A method for detecting, in a human or animal, antibodies associated with infection
by Ehrlichia, wherein said method comprises contacting a biological fluid from said human or
animal with a polypeptide selected from the group consisting of SEQ ID NO. 4, SEQ ID NOS.
14-20, SEQ ID NOS. 23-24, and fragments thereof.

[Figure: nucleotide sequence, numbered 1 to 4683, with the deduced amino acid sequences of the open reading frames shown in single-letter code beneath the coding regions; -35 and -10 elements and ribosome binding sites (RBS) are marked.]

[Figure: nucleotide sequence, numbered 1 to 1570, with the deduced amino acid sequences of the open reading frames shown in single-letter code beneath the coding regions; -35 and -10 elements and a ribosome binding site (RBS) are marked.]

1 acatgtatacattatagraacaaatgttaccgtattrtattcataagttaagtaaaatct

61 ataccattctctttcactttatcagaagacttttatttatcacaaactcatgacgtatag

121 tgtcacaaataaacacactgcaactgcaatcactacgtiaaaacttraactcttctttttc

181 acaactaaaatactaataaaagtaatatagtataaaaaatcttaaataac TTGACA taat
241 attactctgatalASCAIatgtctagtatctctatactaaacgrttatataatr^g^ca
301 t a 1 1 aATGAAAGCTATCAAATTCATACTTAPLTGTCTGCTTACTATTTGCA.GCA;^.TATTTT
MKAIKFILNVCLLFA-4AIFL

361 TAGGGTATTCCTATATTACAAAACAAGGCJVTATTTCAAACAAAACATCATGATA^^

GYSYITKQGIFQTKHKDTPN

421 ATACTACTATACCAAATGAAGACGGTATTCAATCTAGCTTTAGCTTAATCAATCAAGACG
TTIPNEDGIQSSFSLINQDG

481 GTAAAACAGTAACCAGCCAAGATTTCCTAGGGAAACACATGTTAGTTTTGTTTGGATTCT
KTVTSQDFLGKHMLVLFGFS

541 CTGCATGTAAAAGCATTTGCCCTGCAGAATTGGGATTAGTATCTGAAGCACTTGCACAAC
ACKSICPAELGLVSEALAQL

601 TTGGTAATAATGCAGACAAATTACAAGTAATTTTTATTACAATTGATCCAAAAAATGA^
GNNADKLQVIFITIDPKNDT

661 CTGTAGAAAAATTAAAAGAATTTCATGAACATTTTGATTCAAGAATTCAAATGTTAACAG
VEKLKEFHEKFDSRIQMLTG

721 GAAATACTGAAGACATTAATCAAATAATTAAAAATTATAAAATATATGTTGGACAAGCAG
NTEDINQIIKNYKIYVGQAD

781 ATAA^GATCATCAAATTAACCATTCTGCAATAATGTACCTTATTGACAAAAAAGGATC^
KDHQINHSAIMYLIDKKGSY

841 ATCTTTCACACTTCATTCCAGATTTAAAATCACAAGAAAATCAAGTAGATAAG7TACTAT
LSHFIPDLKSQENQVDKLLS

901 CTTTAGTTAAGCAGTATCTGTAAtttaataattaatt AAAG aaaatagtagacarT'^Trti-
LVKQYL*

961 ataaattcatggaatacgttggatgagtaggttttttttagtatttttagrgctaataac
1021 attggcat

721 AGGCAGTGCAGAAGATATTGyW^AAATAATAAAAAATTACAAAATATATGTTGGAC^
781 AGATAAAGATAATCAAATTGATCACTCTGCCATAATGTACATTATCGATAAAAAAGGAGA
DKDNQIDHSAIMYIIDKKGE
901 ATCTATAATAAAACAATATCTCTAAtttaataattaatt aAAGAG aataatacaca CTCT
SIIKQYL*

961 latataaattcatggatatatgtgatgggtagatttcttttggtgtttctatcgctaatt
1021 acatta